142 research outputs found

    Mathematical Optimization Algorithms for Model Compression and Adversarial Learning in Deep Neural Networks

    Get PDF
    Large-scale deep neural networks (DNNs) have made breakthroughs in a variety of tasks, such as image recognition, speech recognition and self-driving cars. However, their large model size and computational requirements add a significant burden to state-of-the-art computing systems. Weight pruning is an effective approach to reduce the model size and computational requirements of DNNs. However, prior works in this area are mainly heuristic methods. As a result, the performance of a DNN cannot maintain for a high weight pruning ratio. To mitigate this limitation, we propose a systematic weight pruning framework for DNNs based on mathematical optimization. We first formulate the weight pruning for DNNs as a non-convex optimization problem, and then systematically solve it using alternating direction method of multipliers (ADMM). Our work achieves a higher weight pruning ratio on DNNs without accuracy loss and a higher acceleration on the inference of DNNs on CPU and GPU platforms compared with prior works. Besides the issue of model size, DNNs are also sensitive to adversarial attacks, a small invisible noise on the input data can fully mislead a DNN. Research on the robustness of DNNs follows two directions in general. The first is to enhance the robustness of DNNs, which increases the degree of difficulty for adversarial attacks to fool DNNs. The second is to design adversarial attack methods to test the robustness of DNNs. These two aspects reciprocally benefit each other towards hardening DNNs. In our work, we propose to generate adversarial attacks with low distortion via convex optimization, which achieves 100% attack success rate with lower distortion compared with prior works. We also propose a unified min-max optimization framework for the adversarial attack and defense on DNNs over multiple domains. Our proposed method performs better compared with the prior works, which use average-based strategies to solve the problems over multiple domains

    RXFOOD: Plug-in RGB-X Fusion for Object of Interest Detection

    Full text link
    The emergence of different sensors (Near-Infrared, Depth, etc.) is a remedy for the limited application scenarios of traditional RGB camera. The RGB-X tasks, which rely on RGB input and another type of data input to resolve specific problems, have become a popular research topic in multimedia. A crucial part in two-branch RGB-X deep neural networks is how to fuse information across modalities. Given the tremendous information inside RGB-X networks, previous works typically apply naive fusion (e.g., average or max fusion) or only focus on the feature fusion at the same scale(s). While in this paper, we propose a novel method called RXFOOD for the fusion of features across different scales within the same modality branch and from different modality branches simultaneously in a unified attention mechanism. An Energy Exchange Module is designed for the interaction of each feature map's energy matrix, who reflects the inter-relationship of different positions and different channels inside a feature map. The RXFOOD method can be easily incorporated to any dual-branch encoder-decoder network as a plug-in module, and help the original backbone network better focus on important positions and channels for object of interest detection. Experimental results on RGB-NIR salient object detection, RGB-D salient object detection, and RGBFrequency image manipulation detection demonstrate the clear effectiveness of the proposed RXFOOD.Comment: 10 page
    • …
    corecore